Notes: Setting up my R environment by loading the ‘ggplot2’ and ‘palmerpenguins’ packages
library(ggplot2)
library("palmerpenguins")
Example 1:
ggplot(data = penguins) +
geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g))
Example 2:
ggplot(data = penguins) +
geom_point(mapping = aes(x = bill_length_mm, y = bill_depth_mm))
This maps color to a specific variable (species), so each species gets its own color
ggplot(data = penguins) +
geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species))
This maps a different shape to each value of a variable (species)
ggplot(data = penguins) +
geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, shape = species, color = species))
This maps a different point size to each species. Note: ggplot2 warns that using size for a discrete variable is not advised.
ggplot(data = penguins) +
geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species, size = species))
This controls the transparency of the points. Alpha is a good option when you have a dense plot with lots of data points
ggplot(data = penguins) +
geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, alpha = species))
Note: If we want to change the overall appearance of our plot without regard to a specific variable, we write the code outside of the aes() function.
ggplot(data = penguins) +
geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, alpha = species), color = "blue")
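Note: A small sketch of why the placement matters. The object names p_mapped and p_fixed are only for illustration. If "blue" is written inside aes(), ggplot2 treats it as a data value rather than a color, so the points get a default palette color and a legend entry labelled "blue".

```r
library(ggplot2)
library(palmerpenguins)

# Inside aes(): "blue" is a constant data value mapped through the color scale,
# so the points are NOT drawn in blue and a legend appears.
p_mapped <- ggplot(data = penguins) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = "blue"))

# Outside aes(): color is a fixed setting, so every point is drawn in blue
# and no legend is added.
p_fixed <- ggplot(data = penguins) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g), color = "blue")
```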
ggplot(data = penguins) +
geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g))
Note: geom_bar() automatically counts how many times each x value appears in the data and then shows the count on the y axis.
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut))
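Note: To see the numbers behind the bars, the same counts can be computed by hand. This is only a sketch for checking the idea; cut_counts, cut_df, and p_col are illustrative names, and geom_col() is used here because it plots values that are already computed.

```r
library(ggplot2)

# Count how many diamonds fall into each cut category.
cut_counts <- table(diamonds$cut)
print(cut_counts)

# as.data.frame() on a one-way table gives the columns Var1 (category) and Freq (count).
cut_df <- as.data.frame(cut_counts)

# geom_col() draws bars from precomputed heights, so this should match geom_bar(aes(x = cut)).
p_col <- ggplot(data = cut_df) +
  geom_col(mapping = aes(x = Var1, y = Freq))
```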
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, color = cut))
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = cut))
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = cut, fill = clarity))
ggplot(data = penguins) +
geom_line(mapping = aes(x = flipper_length_mm, y = body_mass_g))
Note: geom_smooth is useful for showing general trends in our data.
ggplot(data = penguins) +
geom_smooth(mapping = aes(x = flipper_length_mm, y = body_mass_g))
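Note: By default geom_smooth() picks a smoothing method for us. If a straight trend line is preferred, the method can be set explicitly. The sketch below assumes a linear model is a reasonable summary here, which is my choice for illustration; se = FALSE hides the confidence band.

```r
library(ggplot2)
library(palmerpenguins)

# method = "lm" fits a straight line instead of the default smoother;
# se = FALSE removes the shaded confidence interval around the line.
p_lm <- ggplot(data = penguins) +
  geom_smooth(mapping = aes(x = flipper_length_mm, y = body_mass_g),
              method = "lm", se = FALSE)
```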
Note: To plot a separate line for each species of penguin, add the linetype aesthetic to the code
ggplot(data = penguins) +
geom_smooth(mapping = aes(x = flipper_length_mm, y = body_mass_g, linetype = species))
ggplot(data = penguins) +
geom_smooth(mapping = aes(x = flipper_length_mm, y = body_mass_g)) +
geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g), color = "black")
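Note: When two geoms share the same x and y, the mapping can be written once inside ggplot() and each layer inherits it. A minimal sketch of the same plot written that way (p_layers is an illustrative name):

```r
library(ggplot2)
library(palmerpenguins)

# Aesthetics given in ggplot() are inherited by every layer, so x and y are
# written once; layer-specific settings such as color stay in the layer.
p_layers <- ggplot(data = penguins,
                   mapping = aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_smooth() +
  geom_point(color = "black")
```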
geom_jitter() creates a scatter plot and then adds a small amount of random noise to each point in the plot.
Note: Jittering helps us deal with overplotting, which happens when the data points in a plot overlap with each other. It makes the points easier to find
ggplot(data = penguins) +
geom_jitter(mapping = aes(x = flipper_length_mm, y = body_mass_g))
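Note: The amount of noise can be controlled. The sketch below uses position_jitter() with width and height given in data units, plus a seed so the plot looks the same each time it is drawn. The values 0.4, 0, and 42 are arbitrary choices for illustration.

```r
library(ggplot2)
library(palmerpenguins)

# width/height set the maximum shift in each direction (in data units);
# height = 0 leaves body mass untouched. seed makes the jitter reproducible.
p_jitter <- ggplot(data = penguins) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g),
             position = position_jitter(width = 0.4, height = 0, seed = 42))
```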
Facets let you display smaller groups, or subsets, of your data. Faceting can help you discover new patterns in your data and focus on relationships between different variables.
facet_wrap() is used to facet your plot by a SINGLE VARIABLE
Example 1:
ggplot(data = penguins) +
geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
facet_wrap(~species)
Example 2:
ggplot(data = diamonds) +
geom_bar(mapping = aes(x = color, fill = cut)) +
facet_wrap(~cut)
facet_grid() is used to facet your plot with TWO VARIABLES
Note: In facet_grid(sex~species), the first variable (sex) defines the rows and the second variable (species) defines the columns.
ggplot(data = penguins) +
geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
facet_grid(sex~species)
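Note: The same grid can also be written with named arguments, which makes it explicit which variable goes on rows and which on columns. In this sketch I also drop penguins with a missing sex so that no NA row of panels appears. That filtering step is my own assumption, not part of the original example, and penguins_known_sex and p_grid are illustrative names.

```r
library(ggplot2)
library(palmerpenguins)

# Keep only rows where sex is recorded (assumption: an NA facet is not wanted).
penguins_known_sex <- subset(penguins, !is.na(sex))

# rows = vars(...) and cols = vars(...) are the named form of sex ~ species.
p_grid <- ggplot(data = penguins_known_sex) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  facet_grid(rows = vars(sex), cols = vars(species))
```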
Labels: Titles, subtitles, and captions that we put OUTSIDE OF THE GRID of our plot to indicate important information.
ggplot(data = penguins) +
geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
labs(title = "Palmer Penguins: Body Mass vs. Flipper Length")
ggplot(data = penguins) +
geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
labs(title = "Palmer Penguins: Body Mass vs. Flipper Length", subtitle = "Sample of Three Penguin Species")
ggplot(data = penguins) +
geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
labs(title = "Palmer Penguins: Body Mass vs. Flipper Length", subtitle = "Sample of Three Penguin Species", caption = "Data collected by Dr. Kristen Gorman" )
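Note: labs() can also rename the axes and the legend. A short sketch; the label wording is my own and p_axes is an illustrative name.

```r
library(ggplot2)
library(palmerpenguins)

# x and y rename the axis titles; color renames the legend for the color aesthetic.
p_axes <- ggplot(data = penguins) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  labs(title = "Palmer Penguins: Body Mass vs. Flipper Length",
       x = "Flipper length (mm)",
       y = "Body mass (g)",
       color = "Species")
```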
Annotations: Used to put text INSIDE THE GRID to explain or comment on specific data points. In ggplot2, annotations can help explain the plot's purpose or highlight important data.
ggplot(data = penguins) +
geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
labs(title = "Palmer Penguins: Body Mass vs. Flipper Length", subtitle = "Sample of Three Penguin Species", caption = "Data collected by Dr. Kristen Gorman") +
annotate("text", x = 220, y = 3500, label = "The Gentoos are the largest")
ggplot(data = penguins) +
geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
labs(title = "Palmer Penguins: Body Mass vs. Flipper Length", subtitle = "Sample of Three Penguin Species", caption = "Data collected by Dr. Kristen Gorman") +
annotate("text", x = 220, y = 3500, label = "The Gentoos are the largest", color = "blue", fontface = "bold", size = 3, angle = 15)
p <- ggplot(data = penguins) +
geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
labs(title = "Palmer Penguins: Body Mass vs. Flipper Length", subtitle = "Sample of Three Penguin Species", caption = "Data collected by Dr. Kristen Gorman")
p + annotate("text", x = 220, y = 3500, label = "The Gentoos are the largest", color = "blue", fontface = "bold", size = 3, angle = 15)
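Note: annotate() is not limited to text. As a sketch, an arrow can be drawn from the label toward the points it describes by using the "segment" geom. The arrow coordinates below are rough guesses for illustration and would need adjusting by eye; p_arrow is an illustrative name.

```r
library(ggplot2)
library(palmerpenguins)

# The text annotation is the same as above; the segment annotation adds an
# arrow starting just above the label and pointing up toward the Gentoo cluster.
p_arrow <- ggplot(data = penguins) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  annotate("text", x = 220, y = 3500, label = "The Gentoos are the largest") +
  annotate("segment", x = 220, xend = 220, y = 3650, yend = 4300,
           arrow = arrow(length = unit(0.2, "cm")))
```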
Approach 1: Export option in the plots tab
Approach 2: ggsave() function
ggsave("Three Penguin Species.png")
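Note: Called with only a file name, ggsave() saves the most recently displayed plot. A sketch of a more explicit call is below: it passes the plot object and sets the size and resolution. The output path in tempdir(), the file name, and the size values are assumptions for illustration.

```r
library(ggplot2)
library(palmerpenguins)

p_save <- ggplot(data = penguins) +
  geom_point(mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species))

# Hypothetical output location; replace with any path you like.
out_file <- file.path(tempdir(), "three_penguin_species.png")

# plot = names the plot to save; width/height/units set the image size; dpi sets resolution.
ggsave(filename = out_file, plot = p_save, width = 8, height = 5, units = "in", dpi = 300)
```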